-
Notifications
You must be signed in to change notification settings - Fork 2.8k
Implemented ordering for expanded iterators #8741
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Added a test for nested iterator execution ordering. (Failing at commit time!)
When a graph has nested iterators, some "ready to run" node combinations do not actually belong together. Previously, the scheduler would still try to build nodes for those mismatched combinations, which could cause the same work to run more than once. This change skips any combination that is missing a valid iterator parent, so nested iterator expansions run once per intended item.
lstein
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to be able to review this PR, but I've never quite understood how to use the collection/iteration functions in the workflow editor and don't think I have the expertise to identify edge cases and so forth.
If no one else is able to review, I'll come up to speed and give it a shot.
|
Following up on this, maybe you can provide an example workflow that illustrates the problems with the previous non-deterministic execution ordering and demonstrates how the PR fixes this? |
As it's non-deterministic, I can't guarantee that any example will show anything at all. The problem is that when iteration expands, prior to this change the resulting nodes are treated like any others and there are no guarantees that the collection of expanded nodes will execute in any order at all. After this change, the first iteration's nodes should always complete before the second iteration's start, the second's before the third's, etc. Hopefully this change accomplishes that; this section of code is getting very hard to read and understand and would likely benefit from a major refactor at some point. |
|
Drafting this as the collect node needs some adjustment even though execution seems to happen in order. |
|
Sample workflow: OrderingTestWorkflow.json Before this PR, examining the output of the Integer Addition node: After: But, as I indicated above, there's something up with the Collect node. |
|
Collect ordering is now resolved. Try the test workflow. |
Summary
Feature: best-effort in-order execution for nodes expanded by
IterateInvocation, subject to ready state.Why: iteration-expanded work could run in an arbitrary order when multiple iteration branches become ready at the same time, which reduces determinism and makes behavior harder to reason about.
How: compute a cached per-exec-node "iteration path" (outer to inner indices for nested iterators, respecting collector boundaries) and use it to ordered-insert ready nodes into the existing per-class ready queues. FIFO is preserved for equal iteration paths, and no blocking is introduced.
Related Issues / Discussions
QA Instructions
Create a collection, note the ordering, then iterate and pass values to a series of other nodes. Check that the outputs use the source collection's items in order.
Merge Plan
Checklist
What's Newcopy (if doing a release after this PR)